Abstract 

Various modularity matrices have appeared in the recent literature on network analysis and 
algebraic graph theory. Their purpose is to express as quadratic forms certain combinatorial 
functions arising in graph clustering problems. In this paper we highlight certain common 
traits of various modularity matrices and shed light on the spectral properties that underlie 
various theoretical results and practical spectral-type algorithms for community detection. 
1. Introduction 

Consider the following problem: We have a group of individuals, objects, or documents, 
bound together by some reciprocal similarity relationship, and we want to locate a cluster, 
that is, a tightly knit subset of the group that can be recognized as a “community” in some sense. 
In the common terminology of network science, this is an example of a community detection 
problem [12, 24]. Community detection problems are among the most relevant problems 
in the analysis of complex networks. 

Networks are widely used to model a large variety of real-life systems and appear in many 
fields of scientific interest. Community detection and graph clustering methods may reveal 
many significant network properties and, as a consequence, are receiving a considerable amount 
of attention from various research areas, see e.g., [3, 8, 14]. One of the most popular methods 
for community detection is that of modularity. The idea was proposed by Newman and Girvan 
in [19] and is essentially based on the maximization of a function called modularity. 
There is no clear or universally accepted definition of community in a graph; despite 
this, almost every recent definition or community detection method is based on the maximization 
of a quadratic quality function related to the original modularity, see for instance [1, 22, 23]. 

In this paper we propose a unified framework for a number of modularity-type 
matrices and functions borrowed from recent literature on community detection, and we analyse 
those of their spectral properties that are of possible interest for community detection methods. 
In particular, we prove a modularity-oriented version of a well-known theorem due to Fiedler 
[11, Thm. 3.3] that holds for the Laplacian matrix of a graph. Our theorem holds for any negative 
semidefinite rank-one perturbation of a symmetric matrix A with nonnegative off-diagonal 
entries, and can be used to ensure the connectivity of the modules generated by the best known 
algorithms for community detection inspired by the renowned spectral partitioning method. 

This paper is organized as follows. After introducing our main notation below, in Section 
2 we briefly present a number of topics arising in the graph clustering literature, which provide 
several relevant examples motivating our concept of generalized modularity matrix. In 
Section 3 we prove our main result, which shows that a certain nodal domain of a 
leading eigenvector of a generalized modularity matrix is connected. In the subsequent sections 
we study further spectral properties of generalized modularity matrices. In particular, we 
consider the identifiability of a prescribed cluster as a nodal domain of the leading eigenvector 
(Section 4), the increase of the largest eigenvalue due to a newly added edge (Section 5), and 
the relationship between positive eigenvalues of a modularity matrix and the number of distinct 
clusters that can be recognized in a given network (Section 6). Finally, Section 7 contains 
some concluding remarks. 

1.1. Notations and preliminaries 

A symmetric weighted graph G is a pair (V, E) where V is a finite set of nodes (or vertices), 
and $E : V \times V \to \mathbb{R}_{\ge 0}$ is a nonnegative weight function defined over edges, that is, node pairs, 
where $E(i,j) = E(j,i)$. In practice, edges with larger weights represent stronger connections 
among nodes, so missing edges get weight 0. If $E(i,i) > 0$ then we have a loop on node i. Any 
graph considered in the following is assumed symmetric, weighted, and connected. Since V is 
finite we freely identify it with $\{1, \ldots, n\}$. 

There exists a natural bijection that associates to any graph G a componentwise nonnegative, 
irreducible, symmetric matrix $A = (a_{ij})$, called adjacency matrix, defined by $a_{ij} = E(i,j)$. 
Further relevant notation is listed below. 

• For any $i \in V$, $d_i$ denotes its degree, $d_i = \sum_{j=1}^n a_{ij}$. The vector of degrees of G is denoted 
by $d = (d_1, \ldots, d_n)^T$. 

• For any $S \subseteq V$ we denote by $\bar S$ the complement $V \setminus S$ and let $\mathrm{vol}\,S = \sum_{i \in S} d_i$ be the 
volume of S. In particular, $\mathrm{vol}\,G = \sum_{i \in V} d_i$ is the volume of the whole graph. 

• For any $S \subseteq V$, if X is an $n \times n$ matrix then we denote by X(S) the principal submatrix 
of X whose indices are in S. Analogously, we denote by G(S) the subgraph of G induced 
by the nodes in S, that is, the graph whose adjacency matrix is A(S). 

• Let $\mathbf{1}$ denote the vector of all ones whose dimension depends on the context. Furthermore, 
for any $S \subseteq \{1, \ldots, n\}$ we let $\mathbf{1}_S$ be its characteristic vector, defined as $(\mathbf{1}_S)_i = 1$ if $i \in S$ 
and $(\mathbf{1}_S)_i = 0$ otherwise. 

• The cardinality of a set S is denoted by |S|. In particular, |V| = n. 

• For a matrix A and a vector x, we write $A \ge O$ or $x \ge 0$ (resp., $A > O$ or $x > 0$) to 
denote componentwise nonnegativity (resp., positivity). 

• If A is a symmetric matrix then its eigenvalues are denoted by $\lambda_i(A)$ and are ordered as 
$\lambda_1(A) \ge \cdots \ge \lambda_n(A)$, unless otherwise specified. 

We will freely use familiar properties of matrices such as the variational characterization 
of eigenvalues of symmetric matrices, Gershgorin’s eigenvalue localization theorem, and 
fundamental results in Perron-Frobenius theory, see e.g., [2, 27]. For completeness, we recall below 
some important facts concerning the symmetric eigenvalue problem: 



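• (Cauchy interlacing theorem) Let $A \in \mathbb{R}^{n \times n}$ be a symmetric matrix and let $Z \in \mathbb{R}^{n \times (n-k)}$ 
be a matrix with orthonormal columns. Then, for all $i = 1, \ldots, n-k$, 

$$\lambda_i(A) \ge \lambda_i(Z^T A Z) \ge \lambda_{i+k}(A). \qquad (1)$$

• Let $A \in \mathbb{R}^{n \times n}$ be a symmetric matrix and let $B \in \mathbb{R}^{(n-k) \times (n-k)}$ be a principal submatrix 
of A. Then, for all $i = 1, \ldots, n-k$, 

$$\lambda_i(A) \ge \lambda_i(B) \ge \lambda_{i+k}(A). \qquad (2)$$

• (Weyl’s inequalities) Let A be a real symmetric matrix of order n and $v \in \mathbb{R}^n$. Then, for 
$i = 1, \ldots, n-1$, 

$$\lambda_i(A) \ge \lambda_{i+1}(A + vv^T) \ge \lambda_{i+1}(A). \qquad (3)$$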


2. Motivations and overview 


The discovery and description of communities in a graph is a central problem in modern graph 
analysis; an elementary overview of graph clustering problems and techniques is the survey [24]. 
Although intuition suggests that a community (or cluster) in G should be a possibly connected 
group of nodes whose internal connections are stronger than those with the rest of the network, 
there is no universally accepted definition of community. A survey of several proposed definitions 
of community can be found in [12]. However, as the author of that paper underlines, 
the definition based on the modularity quality function is by far the most popular one. The 
modularity function was proposed by Newman and Girvan in [19] as a possible measure of 
whether a subgraph of G is a cluster or not. They assert that a subset $S \subseteq V$ is a cluster if 
the induced subgraph G(S) contains more edges than would be expected if edges were placed at 
random preserving node degrees. Such subsets are exactly those having positive modularity. 
Since subsets with positive modularity carry no information on the connectedness or the size 
of the clusters, we shall call such subgraphs not communities but rather modules. 
Let us formalize this concept. Consider a graph G and the associated adjacency matrix A. 
The graph G may have loops, and edges may be weighted, so that A is a rather arbitrary 
nonnegative matrix. If $d = A\mathbf{1}$ is the degree vector and $\mathrm{vol}\,G = \sum_{i=1}^n d_i$ is the volume of the 
graph, the Newman-Girvan modularity matrix of G is defined as [17, 18, 19] 


$$M_{NG} = A - \frac{1}{\mathrm{vol}\,G}\, d d^T \qquad (4)$$


and the modularity measure of a subset $S \subseteq V$ is usually given by the associated quadratic 
form 

$$Q_{NG}(S) = \mathbf{1}_S^T M_{NG} \mathbf{1}_S$$

where $\mathbf{1}_S$ denotes the characteristic vector of the set $S \subseteq V$. Thus modules are subgraphs G(S) 
such that $Q_{NG}(S) > 0$. A module which is connected and has a considerable size is commonly 
considered a good community candidate. Note the equivalent formulas 

$$Q_{NG}(S) = \mathbf{1}_S^T A \mathbf{1}_S - \frac{(d^T \mathbf{1}_S)^2}{\mathrm{vol}\,G} = e_{in}(S) - \frac{(\mathrm{vol}\,S)^2}{\mathrm{vol}\,G}$$

where 

$$e_{in}(S) = \mathbf{1}_S^T A \mathbf{1}_S = \sum_{i,j \in S} E(i,j) \qquad (5)$$

is the overall strength of internal links. 
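As a small sanity check of the two equivalent expressions for $Q_{NG}(S)$, the following Python sketch evaluates both on a toy graph. The graph (two triangles joined by one edge) and the function names are illustrative assumptions, not taken from the paper; integer edge weights are assumed so that the arithmetic stays exact.

```python
from fractions import Fraction


def q_ng_quadratic(A, S):
    """Q_NG(S) as the quadratic form 1_S^T M_NG 1_S (integer weights assumed)."""
    d = [sum(row) for row in A]
    vol_G = sum(d)
    # Entry (i, j) of M_NG = A - d d^T / vol G, kept exact with Fraction.
    return sum(A[i][j] - Fraction(d[i] * d[j], vol_G) for i in S for j in S)


def q_ng_combinatorial(A, S):
    """Q_NG(S) as e_in(S) - (vol S)^2 / vol G."""
    d = [sum(row) for row in A]
    e_in = sum(A[i][j] for i in S for j in S)
    vol_S = sum(d[i] for i in S)
    return e_in - Fraction(vol_S ** 2, sum(d))


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
S = [0, 1, 2]
print(q_ng_quadratic(A, S), q_ng_combinatorial(A, S))
```

Here $e_{in}(S) = 6$, $\mathrm{vol}\,S = 7$ and $\mathrm{vol}\,G = 14$, so both expressions give $6 - 49/14 = 5/2 > 0$ and the triangle is a module.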

Besides the Newman-Girvan matrix, several generalized modularity matrices appear in the 
community detection literature, often in a rather hidden form. Indeed, in [9] we initially focused 
our investigations on $M_{NG}$, but we realized afterward that a clear common structure is shared 
by a number of different modularity measures and matrices appearing in this scientific area. 
Thus we propose here a spectral analysis which uncovers common properties shared by all of 
them. A generalized modularity matrix is any negative semidefinite rank-one correction of a 
real symmetric matrix with nonnegative off-diagonal entries. We shall denote any such matrix 
by the symbol M; a formal definition is stated in Section 2.4. 

In the remaining part of this section we briefly discuss various topics arising in the 
community detection literature, presenting other modularity-type matrices and thereby motivating the 
introduction of generalized modularity matrices. 

2.1. Newman’s spectral method 

A major task in community detection is to look for a module in G having maximal 
modularity, briefly called a leading module in what follows. Probably the best known methods for 
detecting a leading module are based on spectral techniques, first introduced in 
graph partitioning problems. 

Consider the set $\{0,1\}^n$ of n-dimensional vectors whose components are only 0 or 1. Clearly 
$Q^* = \max_{S \subseteq V} Q_{NG}(S) = \max_{v \in \{0,1\}^n} v^T M_{NG} v$. Now let $u_1, \ldots, u_n$ be the (real) orthonormal 
eigenvectors of $M_{NG}$; then $M_{NG} = \sum_{i=1}^n \lambda_i(M_{NG})\, u_i u_i^T$ and $v^T M_{NG} v = \sum_{i=1}^n \lambda_i(M_{NG}) (u_i^T v)^2$. If 
v could be chosen to be proportional to $u_1$ then the sum would be maximized. However, the 
constraint $v \in \{0,1\}^n$ prevents such a simple choice and makes the optimization problem 
much more difficult. In fact it has been pointed out in several works, for instance [17, 18], 
that it is extremely unlikely that a simple procedure exists for finding the optimal $v \in \{0,1\}^n$. 
Spectral partitioning based methods essentially select v according to the sign of the entries 
of $u_1$, by setting $v_i = 1$ if $(u_1)_i$ is positive (or nonnegative), and $v_i = 0$ otherwise. Then the 
vertex set V is partitioned into $P = \{i \in V \mid v_i = 1\}$ and $N = \bar P$, and G(P) is proposed as an 
approximation of the module having maximal modularity in G. 
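The sign-based selection just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the cited works: the leading eigenvector is approximated by a plain power iteration on a Gershgorin-shifted matrix (an assumption made here to avoid external libraries), the start vector must not be orthogonal to the leading eigenvector, and all function names are hypothetical.

```python
def newman_girvan_matrix(A):
    """Return M_NG = A - d d^T / vol G for a symmetric adjacency matrix A."""
    n = len(A)
    d = [sum(row) for row in A]
    vol = sum(d)
    return [[A[i][j] - d[i] * d[j] / vol for j in range(n)] for i in range(n)]


def leading_eigenvector(M, iters=2000):
    """Approximate an eigenvector for the rightmost eigenvalue of symmetric M.

    Power iteration on M + c*I: by Gershgorin's theorem the shift c makes all
    eigenvalues nonnegative, so the dominant one is the rightmost one of M.
    """
    n = len(M)
    c = max(sum(abs(t) for t in row) for row in M)
    x = [float(i + 1) for i in range(n)]  # arbitrary deterministic start
    for _ in range(iters):
        y = [sum(M[i][j] * x[j] for j in range(n)) + c * x[i] for i in range(n)]
        norm = sum(t * t for t in y) ** 0.5
        x = [t / norm for t in y]
    return x


def spectral_bipartition(A):
    """Split the nodes into P = {i : (u_1)_i > 0} and its complement N."""
    u = leading_eigenvector(newman_girvan_matrix(A))
    P = {i for i, ui in enumerate(u) if ui > 0}
    return P, set(range(len(A))) - P


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
P, N = spectral_bipartition(A)
print(sorted(P), sorted(N))
```

On this toy graph the rightmost eigenvalue of $M_{NG}$ is $\sqrt{3}$ and its eigenvector changes sign exactly across the bridging edge, so the two triangles are recovered; which of them is labelled P depends on the orientation of the computed eigenvector.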

Although the described procedure proposes the subgraph G(P) as a leading module, it 
can be shown that at least one of G(P) and G(N) is a connected subgraph of G, depending on the 
orientation of $u_1$ [9, Thm. 4.2]. Unfortunately, if the sign of $u_1$ is chosen so that 
G(P) is connected, it is not possible to ensure that G(N) is connected as well, at least in the 
general case. Counterexamples are given in [9] and in Remark 3.6 below. 

The described procedure provides a reasonably good bipartition of G. Typical networks, 
however, require a division into more than two parts, so a natural extension of the spectral 
method described so far has been proposed. This idea was probably introduced by Newman 
in [18] and underlies most modern algorithms for community detection, see e.g. 
the renowned Louvain method [3]. We call this procedure the Successive Spectral Graph Bipartition 
algorithm (SSGB) and we briefly sketch it below. 

The spectral method previously described is used to divide the network into two parts P 
and N, so that $V = P \cup N$. Then those parts are bipartitioned again into $P_1$, $N_1$, $P_2$, and 
$N_2$ so that $P = P_1 \cup N_1$, $N = P_2 \cup N_2$, and so forth. The crucial point here is that each time 
the modularity matrix of the subgraphs $G(P_1)$ and $G(N_1)$ must be considered, and of course 
this cannot be done by simply considering the principal submatrices $M_{NG}(P_1)$ and $M_{NG}(N_1)$ 
respectively, since the degrees of vertices in the subgraphs change when some edge is removed. 
Instead, for each subset $S \subseteq V$ and respective subgraph G(S), a new modularity matrix $M_{NG}^S$ 
is defined by setting 

$$M_{NG}^S = M_{NG}(S) - \Bigl(D_{G(S)} - \frac{\mathrm{vol}\,S}{\mathrm{vol}\,G}\, D(S)\Bigr) \qquad (6)$$

where D is the diagonal matrix of the degrees of the original graph G, whereas $D_{G(S)}$ is the diagonal 
matrix of the degrees of the considered subgraph G(S). The SSGB procedure stops when the 
computed modularity matrix $M_{NG}^S$ has no positive eigenvalues. It is worth noting that already 
this crucial procedure generates matrices whose structure differs considerably from that 
of $M_{NG}$, due to the diagonal term. Thus the connectedness of the subgraphs it produces is 
no longer ensured by [9, Thm. 4.2]. However, all the matrices considered there are generalized modularity 
matrices, as we discuss at the end of Section 3. 

2.2. A normalized variant of $M_{NG}$ 

Let $D = \mathrm{Diag}(d_1, \ldots, d_n)$ be the diagonal matrix of the degrees of the graph G. In analogy 
with the renowned normalized Laplacian matrix of a graph [5], the normalized version of the 
Newman-Girvan modularity matrix is defined by 

$$M_{norm} = D^{-1/2} M_{NG} D^{-1/2}.$$

Even though that matrix is not very popular in the community detection literature, $M_{norm}$ 
appears in various network-related questions, such as the analysis of quasi-randomness properties of 
graphs with given degree sequences, see e.g., [4] or [5, Chap. 5]. It is straightforward to see that 
the modularity measure induced by $M_{NG}$ can also be defined as a quadratic form associated 
with $M_{norm}$. In fact, if $v = D^{1/2}\mathbf{1}_S$ then 

$$Q_{NG}(S) = \mathbf{1}_S^T M_{NG} \mathbf{1}_S = v^T M_{norm} v.$$

The effect of the diagonal scaling becomes apparent when considering Rayleigh quotients instead 
of quadratic forms. Indeed, $\mathbf{1}_S^T M_{NG} \mathbf{1}_S / \mathbf{1}_S^T \mathbf{1}_S = Q_{NG}(S)/|S|$, whereas 

$$\frac{v^T M_{norm} v}{v^T v} = \frac{\mathbf{1}_S^T M_{NG} \mathbf{1}_S}{\mathbf{1}_S^T D \mathbf{1}_S} = \frac{Q_{NG}(S)}{\mathrm{vol}\,S}.$$

Note that $A_{norm} = D^{-1/2} A D^{-1/2}$ is a nonnegative irreducible matrix to which corresponds the 
symmetric weighted graph $G_{norm} = (V, \tilde E)$ whose weight function is $\tilde E(i,j) = E(i,j)/\sqrt{d_i d_j}$. 
Therefore $M_{norm}$ and $M_{NG}$ share the crucial property of being a negative semidefinite rank-one 
correction of the adjacency matrix of a graph. 
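The Rayleigh quotient identity for the normalized matrix can be checked numerically. The sketch below is an illustration under assumed names and an assumed toy graph (two triangles joined by an edge); it forms $D^{-1/2} M_{NG} D^{-1/2}$ entrywise in floating point.

```python
import math


def rayleigh_normalized(A, S):
    """Rayleigh quotient of M_norm = D^{-1/2} M_NG D^{-1/2} at v = D^{1/2} 1_S."""
    n = len(A)
    d = [sum(row) for row in A]
    vol_G = sum(d)
    M_norm = [[(A[i][j] - d[i] * d[j] / vol_G) / math.sqrt(d[i] * d[j])
               for j in range(n)] for i in range(n)]
    v = [math.sqrt(d[i]) if i in S else 0.0 for i in range(n)]
    num = sum(v[i] * M_norm[i][j] * v[j] for i in range(n) for j in range(n))
    den = sum(t * t for t in v)  # equals vol S
    return num / den


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
S = {0, 1, 2}
print(rayleigh_normalized(A, S))  # compare with Q_NG(S) / vol S = (5/2) / 7
```

For this graph $Q_{NG}(S) = 5/2$ and $\mathrm{vol}\,S = 7$, so the quotient should agree with $5/14$ up to rounding.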

2.3. The matrix approach to the resolution limit 

Although modularity optimization techniques are very popular, it has recently been pointed 
out that they suffer from a resolution limit, see e.g., [13, 15, 16]. In fact, it has been noted that 
modularity maximization algorithms are inclined to merge small clusters into larger modules. 
Various alternative modularity measures have been proposed in recent years, essentially based 
on the introduction of a tunable scaling coefficient $\gamma$ (usually called resolution parameter) or 
on the insertion of weighted self-loops. 

Starting from a statistical mechanics approach which interprets community detection as 
finding the ground state of a spin system, Reichardt and Bornholdt introduced in [22] a parametrized 
modularity measure for $S \subseteq V$. In our notation, that definition reads 

$$Q_\gamma(S) = e_{in}(S) - (\gamma/\mathrm{vol}\,G)(\mathrm{vol}\,S)^2$$

where $\gamma > 0$ is the resolution parameter and $e_{in}(S)$ is as in (5). We observe that, introducing 
the matrix 

$$M_{RB} = A - (\gamma/\mathrm{vol}\,G)\, d d^T, \qquad (7)$$

we have $Q_\gamma(S) = \mathbf{1}_S^T M_{RB} \mathbf{1}_S$, and when $\gamma = 1$ we recover the Newman-Girvan modularity matrix 
(4). Also from statistical mechanics considerations, the parametrized modularity function 

$$Q_\gamma(S) = e_{in}(S) - \gamma |S|^2$$

has been considered by Ronhovde and Nussinov in [23] as well as by other authors, see e.g., [21, 26], 
possibly with minor notational variations or scaling factors. By defining the matrix 

$$M_{RN} = A - \gamma \mathbf{1}\mathbf{1}^T$$

we can express the previous modularity function as $Q_\gamma(S) = \mathbf{1}_S^T M_{RN} \mathbf{1}_S$. Another approach has 
been proposed in [1] by Arenas, Fernández and Gómez. The following matrix is suggested as 
an alternative to the original Newman-Girvan modularity matrix: 

$$M_{AFG} = A + \gamma I - \frac{(d + \gamma\mathbf{1})(d + \gamma\mathbf{1})^T}{\gamma n + \mathrm{vol}\,G}$$

where $\gamma \in \mathbb{R}$ is the resolution parameter, n is the number of nodes of the graph and d is the 
degree vector as usual. Note that the matrix $A + \gamma I$ is the adjacency matrix of G where a 
self-loop with weight $\gamma$ is added to each node, so that $M_{AFG}$ is nothing but the Newman-Girvan 
matrix of the graph updated by the added loops. 
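The three parametrized matrices can be compared on a toy example. The Python sketch below uses exact rational arithmetic; the function names, the test graph and the value $\gamma = 1/2$ are assumptions made for illustration only.

```python
from fractions import Fraction


def quad(M, S):
    """Quadratic form 1_S^T M 1_S."""
    return sum(M[i][j] for i in S for j in S)


def m_rb(A, gamma):
    """Reichardt-Bornholdt matrix A - (gamma / vol G) d d^T."""
    n = len(A)
    d = [sum(row) for row in A]
    vol = sum(d)
    return [[A[i][j] - gamma * d[i] * d[j] / vol for j in range(n)] for i in range(n)]


def m_rn(A, gamma):
    """Ronhovde-Nussinov matrix A - gamma 1 1^T."""
    n = len(A)
    return [[A[i][j] - gamma for j in range(n)] for i in range(n)]


def m_afg(A, gamma):
    """Arenas-Fernandez-Gomez matrix A + gamma I - (d + gamma 1)(d + gamma 1)^T / (gamma n + vol G)."""
    n = len(A)
    d = [sum(row) for row in A]
    denom = gamma * n + sum(d)
    return [[A[i][j] + (gamma if i == j else 0)
             - (d[i] + gamma) * (d[j] + gamma) / denom
             for j in range(n)] for i in range(n)]


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
S = [0, 1, 2]
gamma = Fraction(1, 2)
# Adjacency matrix of the graph with a self-loop of weight gamma at each node.
A_loops = [[A[i][j] + (gamma if i == j else 0) for j in range(6)] for i in range(6)]
print(quad(m_rb(A, gamma), S), quad(m_rn(A, gamma), S))
print(m_afg(A, gamma) == m_rb(A_loops, Fraction(1)))
```

With $e_{in}(S) = 6$, $\mathrm{vol}\,S = 7$, $\mathrm{vol}\,G = 14$ and $|S| = 3$, the two quadratic forms evaluate to $6 - \tfrac12 \cdot \tfrac{49}{14} = 17/4$ and $6 - \tfrac12 \cdot 9 = 3/2$, and the last line confirms that $M_{AFG}$ coincides with the Newman-Girvan matrix of the graph with added loops.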


2.4. Generalized modularity matrices and measures 

Motivated by the aforementioned definitions, we consider the following generalization of the 
Newman-Girvan modularity matrix: 

Definition 2.1. Let A be the adjacency matrix of an undirected, connected graph, possibly 
endowed with loops and weighted edges, let W be a real diagonal matrix, let $v \ne 0$ be a nonnegative 
vector, and let $\sigma$ be a positive scalar. The matrix $M = A + W - \sigma vv^T$ is a generalized modularity 
matrix. 

According to Definition 2.1, it is clear that all previously defined modularity-type matrices 
$M_{NG}$, $M_{norm}$, $M_{RB}$, $M_{AFG}$ and $M_{RN}$ are indeed generalized modularity matrices. 

Hereafter, we adopt the notation Q(S) to indicate the modularity measure of $S \subseteq V$ 
corresponding to (or induced by) a given generalized modularity matrix M, that is, 

$$Q(S) = \mathbf{1}_S^T M \mathbf{1}_S.$$

Note that, if G = (V, E) and $W = \mathrm{Diag}(w_1, \ldots, w_n)$ then the resulting expression for the 
modularity of S is 

$$Q(S) = e_{in}(S) + \sum_{i \in S} w_i - \sigma \Bigl(\sum_{i \in S} v_i\Bigr)^2$$

where $e_{in}(S)$ is as in (5). Thus, the diagonal matrix W establishes a weight on each node, and 
the value of Q(S) includes the sum of all node weights in S. 

Finally, as it will play a crucial role in forthcoming discussions, we borrow from [9] the 
notation 

$$m_G = \lambda_1(M)$$

to denote the leading (i.e., rightmost) eigenvalue of a generalized modularity matrix M 
associated to the graph G. Owing to the inequality 

$$m_G = \max_{x \ne 0} \frac{x^T M x}{x^T x} \ge \frac{\mathbf{1}_S^T M \mathbf{1}_S}{\mathbf{1}_S^T \mathbf{1}_S} = \frac{Q(S)}{|S|}$$

the existence of a module in G implies that $m_G > 0$. Moreover, $m_G$ is an upper bound for the 
“relative modularity” $Q(S)/|S|$. 

Furthermore, for any two disjoint subsets $S, T \subseteq V$ we will consider their joint modularity 
$Q(S,T) = \mathbf{1}_S^T M \mathbf{1}_T$. Note that $Q(S \cup T) = Q(S) + Q(T) + 2Q(S,T)$. In particular, $Q(S \cup T) > 
Q(S) + Q(T)$ if and only if $Q(S,T) > 0$. 
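The explicit expression for Q(S) and the splitting rule for the joint modularity can be verified on a small instance. In the Python sketch below the graph, the node weights w, the vector v and the scalar sigma are arbitrary illustrative choices (assumptions, not values from the paper), and exact rational arithmetic is used.

```python
from fractions import Fraction


def generalized_modularity_matrix(A, w, v, sigma):
    """M = A + Diag(w) - sigma v v^T, as in Definition 2.1."""
    n = len(A)
    return [[A[i][j] + (w[i] if i == j else 0) - sigma * v[i] * v[j]
             for j in range(n)] for i in range(n)]


def joint(M, S, T):
    """1_S^T M 1_T; joint(M, S, S) is the modularity Q(S)."""
    return sum(M[i][j] for i in S for j in T)


# Hypothetical data: two triangles joined by the edge (2,3), arbitrary w, v, sigma.
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
w = [Fraction(1, 3), 0, -1, 2, 0, Fraction(1, 2)]
v = [1, 2, 0, 1, 1, 3]
sigma = Fraction(1, 10)
M = generalized_modularity_matrix(A, w, v, sigma)

S, T = [0, 1], [2, 3]
e_in = sum(A[i][j] for i in S for j in S)
q_explicit = e_in + sum(w[i] for i in S) - sigma * sum(v[i] for i in S) ** 2
print(joint(M, S, S) == q_explicit)
print(joint(M, S + T, S + T) == joint(M, S, S) + joint(M, T, T) + 2 * joint(M, S, T))
```

Both comparisons print True; the second one relies only on the symmetry of M.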


3. Nodal domains of leading eigenvectors 

Given a nonzero vector $v \in \mathbb{R}^n$, the subgraph G(S) induced by the set $S = \{i : v_i \ge 0\}$ is a 
nodal domain of v [6, 7]. This fundamental definition admits obvious variations (for example, 
the inequality can be strict, or reversed) and, since the seminal papers by Fiedler [10, 11], it has 
become a major tool of spectral methods in community detection and graph partitioning 
[17, 20, 24]. Indeed, nodal domains of eigenvectors of Laplacian or modularity matrices are 
commonly used to locate subgraphs having sought properties. 

In this section we consider nodal domains of the leading eigenvector of generalized 
modularity matrices. In particular, the forthcoming Theorem 3.5 is a modularity matrix counterpart 
of Fiedler’s theorem [11, Thm. 3.3] about Laplacian matrices. 

Lemma 3.1. Let $A \ge O$ be irreducible and let W be any real diagonal matrix. Then $\lambda_1(A + W)$ 
is simple and admits a positive eigenvector. 

Proof. As W is a real diagonal matrix, there exists a nonnegative scalar $\alpha$ such that the shifted 
matrix $A + W + \alpha I$ is nonnegative and irreducible. Then the Perron-Frobenius theorem yields 
the claim. □ 

Lemma 3.2. Let $M = A + W - \sigma vv^T$ be a generalized modularity matrix. Then $m_G < 
\lambda_1(A + W)$. 

Proof. Weyl’s inequalities (3) give $m_G \le \lambda_1(A + W)$. Suppose by contradiction that $m_G = \lambda_1(A + W)$. 
Let x and y be eigenvectors corresponding to $m_G$ and $\lambda_1(A + W)$, that is, $Mx = m_G x$ 
and $(A + W)y = \lambda_1(A + W)y$. By Lemma 3.1 we can suppose $y > 0$. Hence, 

$$m_G\, x^T y = x^T M y = x^T (A + W - \sigma vv^T) y = \lambda_1(A + W)\, x^T y - \sigma (x^T v)(v^T y).$$

Thus $\sigma (x^T v)(v^T y) = 0$. Since $\sigma (v^T y) > 0$ we must have $x^T v = 0$. Then, $\lambda_1(A + W)x = m_G x = 
Mx = (A + W - \sigma vv^T)x = (A + W)x$. Consequently, x is an eigenvector of A + W corresponding 
to its first eigenvalue. Lemma 3.1 implies either $x > 0$ or $x < 0$. In both cases $x^T v = 0$ cannot 
hold. □ 

Interlacing properties between the spectra of A + W and M lead us immediately to the 
inequalities 

$$\lambda_2(A + W) \le m_G \le \lambda_1(A + W).$$

The previous lemma shows that the rightmost inequality is always strict. The next statement 
clarifies that, under common circumstances, the leftmost inequality is also strict and $m_G$ is a 
simple eigenvalue of M. 
Theorem 3.3. If $M = A + W - \sigma vv^T$ is a generalized modularity matrix and v is not a leading 
eigenvector of A + W then $m_G$ is a simple eigenvalue. 

Proof. For an arbitrary vector x we have $x^T M x = x^T (A + W) x - \sigma (v^T x)^2$. From the 
Courant-Fischer minimax theorem, 

$$m_G = \max_{x \ne 0} \frac{x^T M x}{x^T x} \ge \max_{x \ne 0,\ v^T x = 0} \frac{x^T (A + W) x}{x^T x} \ge \min_{z \ne 0}\ \max_{x \ne 0,\ z^T x = 0} \frac{x^T (A + W) x}{x^T x} = \lambda_2(A + W).$$

Thus we may have $m_G = \lambda_2(A + W)$ only if the two preceding inequalities hold as equalities, 
that is, $v^T x = 0$ where x is an eigenvector of M associated to $m_G$, and $(A + W)v = \lambda_1(A + W)v$, 
owing to orthogonality of eigenvectors of a symmetric matrix. However, if the latter equation 
holds, then v is also an eigenvector of M. Indeed, 

$$\lambda_1(A + W)v = (A + W)v = Mv + \sigma (v^T v) v,$$

whence $Mv = (\lambda_1(A + W) - \sigma v^T v)v$. Consequently, the equation $v^T x = 0$ is redundant, again 
owing to orthogonality of eigenvectors. Finally, if $m_G$ is not simple then M has at least two 
eigenvalues strictly greater than $\lambda_2(A + W)$, which contradicts Weyl’s inequalities (3), and the 
proof is complete. □ 

Lemma 3.4. Let $M = A + W - \sigma vv^T$ be a generalized modularity matrix. Let $Mx = m_G x$ 
with the eigenvector x oriented so that $v^T x \ge 0$. Then, $S = \{i : x_i \ge 0\}$ induces a connected 
subgraph. 

Proof. By hypothesis, we have the componentwise inequality $m_G x = Mx = (A + W)x - 
(\sigma v^T x)v \le (A + W)x$. 

By contradiction, assume that S induces 2 disjoint connected subgraphs, say $G(S_1)$ and 
$G(S_2)$. Reorder and partition consistently A, W, M, and v in such a way that $S_1 = \{1, \ldots, n_1\}$ 
and $S_2 = \{n_1 + 1, \ldots, n_2\}$. Consider the first $n_2$ equations in the inequality $m_G x \le (A + W)x$: 

$$\begin{pmatrix} m_G x_1 \\ m_G x_2 \end{pmatrix} \le \begin{pmatrix} A_{11} + W_{11} & O & A_{13} \\ O & A_{22} + W_{22} & A_{23} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} A_{11}x_1 + W_{11}x_1 + A_{13}x_3 \\ A_{22}x_2 + W_{22}x_2 + A_{23}x_3 \end{pmatrix}.$$

Note that $x_3 < 0$ and $A_{i3} \ne O$ by irreducibility, for i = 1, 2. In particular, we have strict 
inequality in at least one entry both in $S_1$ and in $S_2$. Let $y_1$ and $y_2$ be left eigenvectors of 
$A_{11} + W_{11}$ and $A_{22} + W_{22}$, respectively, such that $y_i^T (A_{ii} + W_{ii}) = \lambda_1(A_{ii} + W_{ii})\, y_i^T$ for i = 1, 2. 
Then, 

$$m_G\, y_i^T x_i \le y_i^T (A_{ii} + W_{ii}) x_i + y_i^T A_{i3} x_3 < y_i^T (A_{ii} + W_{ii}) x_i = \lambda_1(A_{ii} + W_{ii})\, y_i^T x_i,$$

for i = 1, 2. Obviously, $y_i^T x_i \ge 0$ since $y_i > 0$ by Lemma 3.1 and $x_i \ge 0$ by hypothesis. Actually, 
due to the strict inequality above, we must have $y_i^T x_i > 0$. Thus both $\lambda_1(A_{11} + W_{11}) > m_G$ 
and $\lambda_1(A_{22} + W_{22}) > m_G$. By the eigenvalue interlacing inequalities (2), we conclude that A + W 
has at least 2 eigenvalues strictly larger than $m_G$, which contradicts Weyl’s inequalities (3). □ 

We can continue the argument in the previous proof as follows. Let y be any vector such 
that $(A + W)y \ge m_G y$. For example, y can be a positive eigenvector associated to $\lambda_1(A + W)$, since by 
Lemma 3.2 we know that $\lambda_1(A + W) > m_G$. Let $z = x + y$. Thus $m_G z \le (A + W)z$ and, with 
arguments analogous to the ones exploited before, we obtain the following result. 



Theorem 3.5. Under the same hypotheses and notation as in Lemma 3.4, let y be a positive 
eigenvector of A + W corresponding to $\lambda_1(A + W)$. Then, for any $\varepsilon \ge 0$, the set $S = \{i : x_i + \varepsilon y_i \ge 0\}$ 
induces a connected subgraph. 

Remark 3.6. A connectedness result concerning the set $S = \{i : x_i > 0\}$, where x is an 
eigenvector as in the hypotheses of Lemma 3.4, can be obtained only under the additional assumption 
that $m_G$ is simple. Indeed, consider the following example: Let G be a star graph on n = m + 1 
nodes, with every node endowed with a loop carrying the weight $\sqrt{m}$. Its adjacency matrix is 

$$A = \begin{pmatrix} \sqrt{m} & \mathbf{1}^T \\ \mathbf{1} & \sqrt{m}\, I \end{pmatrix}.$$

Easy computations show that $M_{NG}$ has an (m − 1)-fold leading eigenvalue $m_G$ equal to $\sqrt{m}$; 
every associated eigenvector is a zero-sum vector vanishing at the star center. Consequently, if 
x is a leading eigenvector of $M_{NG}$ then $S = \{i : x_i > 0\}$ is connected if and only if it reduces to 
a single node. We will not pursue this argument here, and refer the interested reader to Section 
4 of [9]. 


3.1. An application to community detection 

In this subsection we describe a major application of Theorem 3.5 to the community 
detection problem through the following Corollary 3.7. First of all, let us underline that as soon 
as the modularity measure is induced by a generalized modularity matrix M, it is reasonable to 
consider the SSGB procedure (see Section 2) applied to M, in order to subdivide the graph into 
modules. However, the definition of the matrix in (6) does not always carry over. The matrix $M_{NG}^S$ 
considered there is the Newman-Girvan modularity matrix associated to the subgraph G(S) 
induced by S. However, the structure of a generalized modularity matrix $M = A + W - \sigma vv^T$ 
might be only partially defined in terms of G, as W, v and $\sigma$ may be arbitrary. Let us agree now 
that, if this is the case, then we denote by $M^S$ the principal submatrix of M whose indices 
are in S. Otherwise let $M^S$ denote the generalized modularity matrix defined in terms of G(S). 
With this notation the SSGB scheme survives unchanged when $M_{NG}$ is replaced by a generic 
M. The next corollary shows that Theorem 3.5 gives us information on the connectivity of 
the modules produced by the SSGB method, whenever the modularity measure is induced by 
a generic M. 

Corollary 3.7. Let M be any generalized modularity matrix. The spectral method applied to M 
generates a pair of subgraphs, one of which is certainly connected. Similarly, the SSGB method 
applied to M generates m subgraphs, half of which are certainly connected. 

Proof. Let $M = A + W - \sigma vv^T$. It is enough to observe that, if u is the eigenvector corresponding 
to the largest eigenvalue $m_G$, oriented so that $u^T v \ge 0$, then, due to Theorem 3.5, the set 
$S = \{i : u_i \ge 0\}$ defines a bipartition of the node set such that G(S) is connected. However, 
we have no a priori control on the connectivity of the other set of the bipartition. 

A similar argument proves the claim for the SSGB scheme. Due to the successive 
bipartitions, the algorithm produces m/2 pairs of subsets. To be precise, at each step of the scheme, 
a subset $S \subseteq V$ is given, then the matrix $M^S$ is computed and S is partitioned into a pair of 
subsets identified by the sign of the entries of the leading eigenvector of $M^S$. Note that, 
for any generalized modularity matrix M, the matrix $M^S$ has the form $M^S = A' + W' + R$, 
where A' is a nonnegative symmetric matrix, W' is diagonal and R is a negative semidefinite 
rank-one matrix. Observe now that, if $u_S$ is an eigenvector corresponding to the largest eigenvalue 
of $M^S$, and its sign is chosen appropriately, then the hypotheses of Theorem 3.5 are satisfied. 
As a consequence the set $S_+ = \{i \in S : (u_S)_i \ge 0\}$ induces a connected subgraph in G(S), thus 
a connected subgraph in G. □ 
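In numerical experiments, the connectivity claimed by Corollary 3.7 for a computed nodal domain can be checked directly on the graph. The following Python sketch does this with a breadth-first search restricted to S; the function name and the example graph are hypothetical.

```python
from collections import deque


def induces_connected_subgraph(A, S):
    """Return True if the subgraph G(S) induced by the node set S is connected."""
    S = set(S)
    if not S:
        return False
    start = next(iter(S))
    seen = {start}
    queue = deque([start])
    while queue:
        i = queue.popleft()
        # Only edges with both endpoints in S belong to G(S).
        for j in S:
            if j not in seen and A[i][j] > 0:
                seen.add(j)
                queue.append(j)
    return seen == S


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
print(induces_connected_subgraph(A, {0, 1, 2}))     # one triangle
print(induces_connected_subgraph(A, {0, 1, 4, 5}))  # bridge endpoints removed
```

The first set induces a triangle, while the second one splits into two separate edges once nodes 2 and 3 are excluded.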

4. A criterion for the leading eigenpair 

The classical Perron-Frobenius theory has been extended in various ways to matrices having 
some negative entries. One such extension, found in [25], allows us to predict the sign pattern 
in the leading eigenvector of a generalized modularity matrix. 

Lemma 4.1. Let $P \in \mathbb{R}^{n \times n}$ be a symmetric matrix. If $\mathbf{1}^T P \mathbf{1} > \sqrt{(n-1)^2 + 1}\, \|P\|_F$ then $\rho(P)$ 
is an eigenvalue of P which is simple and associated to a nonnegative eigenvector. 

Proof. See [25, Thm. 4.1]. □ 

Remark that the preceding lemma makes no assumptions on signs and sizes of the entries 
of the matrix P. In fact, various examples shown in [25] illustrate that the hypotheses of this 
lemma can be fulfilled by matrices having some negative entries. 

Theorem 4.2. Let $M \in \mathbb{R}^{n \times n}$ be a generalized modularity matrix. Let $S \subseteq V$ be a set fulfilling 
the inequality 

$$Q(S) + Q(\bar S) - 2Q(S, \bar S) > \sqrt{(n-1)^2 + 1}\, \|M\|_F.$$

Then $\rho(M) = m_G$ is a simple eigenvalue of M which is associated to an eigenvector x with the 
following property: $S = \{i : x_i \ge 0\}$. 

Proof. Let J be the diagonal matrix such that $J_{ii} = 1$ if $i \in S$ and $J_{ii} = -1$ otherwise. 
Moreover, let P = JMJ. Observe that 

$$\mathbf{1}^T P \mathbf{1} = (\mathbf{1}_S - \mathbf{1}_{\bar S})^T M (\mathbf{1}_S - \mathbf{1}_{\bar S}) = Q(S) + Q(\bar S) - 2Q(S, \bar S).$$

On the other hand, $\|P\|_F = \|M\|_F$. Finally, x is an eigenvector of M if and only if Jx is an 
eigenvector of P. Thus the claim follows from Lemma 4.1. □ 
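The identity used in the proof, $\mathbf{1}^T (JMJ) \mathbf{1} = Q(S) + Q(\bar S) - 2Q(S, \bar S)$, is easy to confirm numerically. The Python sketch below does so for the Newman-Girvan matrix of a toy graph; the graph and the helper names are illustrative assumptions.

```python
from fractions import Fraction


def newman_girvan_matrix(A):
    """Exact M_NG = A - d d^T / vol G (integer weights assumed)."""
    n = len(A)
    d = [sum(row) for row in A]
    vol = sum(d)
    return [[A[i][j] - Fraction(d[i] * d[j], vol) for j in range(n)] for i in range(n)]


def joint(M, S, T):
    """1_S^T M 1_T; joint(M, S, S) is the modularity Q(S)."""
    return sum(M[i][j] for i in S for j in T)


def signed_sum(M, S):
    """1^T (J M J) 1, where J_ii = +1 for i in S and -1 otherwise."""
    n = len(M)
    s = [1 if i in S else -1 for i in range(n)]
    return sum(s[i] * M[i][j] * s[j] for i in range(n) for j in range(n))


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
M = newman_girvan_matrix(A)
S, S_bar = [0, 1, 2], [3, 4, 5]
lhs = signed_sum(M, S)
rhs = joint(M, S, S) + joint(M, S_bar, S_bar) - 2 * joint(M, S, S_bar)
print(lhs, rhs)
```

Here $Q(S) = Q(\bar S) = 5/2$ and $Q(S, \bar S) = -5/2$, so both sides equal 10.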

However, since for a modularity matrix M the rightmost eigenvalue may not be equal to 
the spectral radius, it is useful to derive a weakened version of the previous theorem which 
considers the matrix pencil $M + \alpha I$. 

Corollary 4.3. Let $M \in \mathbb{R}^{n \times n}$ be a generalized modularity matrix. Let $S \subseteq V$ be a set fulfilling 
the inequality 

$$Q(S) + Q(\bar S) - 2Q(S, \bar S) > \sqrt{(n-1)^2 + 1}\, \|M + \alpha I\|_F - n\alpha$$

for some $\alpha \in \mathbb{R}$. Then the rightmost eigenvalue of M is simple and associated to an eigenvector 
x such that $S = \{i : x_i \ge 0\}$. 

Proof. Repeat the argument in the previous proof with the matrix M replaced by $M + \alpha I$. 
Note that, in this case, $\mathbf{1}^T P \mathbf{1} = Q(S) + Q(\bar S) - 2Q(S, \bar S) + n\alpha$. □ 

Observe that the effect of introducing the shift $M + \alpha I$ is twofold: If $\alpha > 0$ then the spectrum 
of M is translated to the right, and the rightmost eigenvalue may become the spectral radius 
of the shifted matrix. Moreover, the Frobenius norm of the shifted matrix may be smaller than 
that of M, for example, when the graph has no loops; in that case, the diagonal of M is negative 
and $\|M\|_F$ can be decreased by means of a small positive shift. Indeed, note that $\|M + \alpha I\|_F$ 
is minimum when $\alpha = -\mathrm{trace}(M)/n$. 
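The optimal shift follows from $\|M + \alpha I\|_F^2 = \|M\|_F^2 + 2\alpha\, \mathrm{trace}(M) + n\alpha^2$, a quadratic in $\alpha$ minimized at $\alpha = -\mathrm{trace}(M)/n$. The Python sketch below illustrates this on the Newman-Girvan matrix of a loop-free toy graph; the graph and the function names are assumptions made for illustration.

```python
from fractions import Fraction


def newman_girvan_matrix(A):
    """Exact M_NG = A - d d^T / vol G (integer weights assumed)."""
    n = len(A)
    d = [sum(row) for row in A]
    vol = sum(d)
    return [[A[i][j] - Fraction(d[i] * d[j], vol) for j in range(n)] for i in range(n)]


def frobenius_sq(M, alpha=0):
    """Squared Frobenius norm of M + alpha * I."""
    n = len(M)
    return sum((M[i][j] + (alpha if i == j else 0)) ** 2
               for i in range(n) for j in range(n))


def best_shift(M):
    """Minimizer alpha = -trace(M) / n of the Frobenius norm of M + alpha * I."""
    n = len(M)
    return -sum(M[i][i] for i in range(n)) / Fraction(n)


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
M = newman_girvan_matrix(A)
a_star = best_shift(M)
print(a_star, frobenius_sq(M), frobenius_sq(M, a_star))
```

Since the graph has no loops, the diagonal of $M_{NG}$ is negative, the optimal shift $17/42$ is positive, and the shifted matrix has a strictly smaller Frobenius norm than $M_{NG}$.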
5. Sensitivity under small perturbations 


Let G = (V, E) be a given graph and let $S \subseteq V$ be a set with Q(S) > 0, where Q is the 
modularity function induced by $M_{NG}$. Let (i, j) be an edge missing in G, and let G' = (V, E') 
be the graph obtained by adding that edge to G: $E'(i,j) > 0$. It is not difficult to verify that, 
due to the new edge, 

• if both $i \in S$ and $j \in S$ then Q(S) increases, 

• if $i \in S$ and $j \notin S$ then Q(S) decreases. 

Analogous behaviours can be observed with other modularity-type functions among those 
recalled in Section 2. Indeed, in some sense, in the first case the new edge strengthens the 
internal connection, and S becomes a stronger community than before; while in the latter case 
S becomes less separated from its exterior, hence it is less recognizable as a community. It is 
natural to ask whether that “monotonicity property” of the modularity function is somehow 
preserved by $m_G$. Indeed, from our standpoint, it makes sense to observe the variation of the 
rightmost eigenvalue $m_G$ of M after a small increment of the weight of one of the edges of 
G. Accordingly, in place of conditions like $i \in S$ or $i \notin S$, we consider the sign of the i-th 
entry of a corresponding eigenvector of M. In fact, nodal domain based methods employ signs 
of eigenvector entries to locate possible communities: a positive value indicates that the vertex 
belongs to a cluster and a negative value that it is outside the cluster. 
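The two monotonicity statements for the Newman-Girvan modularity can be observed on a small example. The Python sketch below uses a 6-cycle and unit edge weights, both chosen here purely for illustration; the function names are hypothetical.

```python
from fractions import Fraction


def q_ng(A, S):
    """Newman-Girvan modularity Q(S) = e_in(S) - (vol S)^2 / vol G."""
    d = [sum(row) for row in A]
    e_in = sum(A[i][j] for i in S for j in S)
    vol_S = sum(d[i] for i in S)
    return e_in - Fraction(vol_S) ** 2 / sum(d)


def add_edge(A, i, j, eps):
    """Return a copy of A with the weight of edge (i, j), i != j, increased by eps."""
    B = [row[:] for row in A]
    B[i][j] += eps
    B[j][i] += eps
    return B


# Hypothetical example: the cycle 0-1-2-3-4-5-0 and the set S = {0, 1, 2}.
n = 6
C6 = [[1 if (i - j) % n in (1, n - 1) else 0 for j in range(n)] for i in range(n)]
S = [0, 1, 2]
print(q_ng(C6, S))                     # original graph
print(q_ng(add_edge(C6, 0, 2, 1), S))  # new edge inside S
print(q_ng(add_edge(C6, 1, 4, 1), S))  # new edge leaving S
```

In the original cycle $Q(S) = 4 - 36/12 = 1$. Adding the internal edge (0, 2) raises it to $6 - 64/14 = 10/7$, while adding the edge (1, 4) across the boundary of S lowers it to $4 - 49/14 = 1/2$.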

Of course, if $M = A + W - \sigma vv^T$ is a generalized modularity matrix for $G$, the modularity
matrix $M' = A' + W' - \sigma' v'{v'}^{T}$ for the new graph $G'$ should be defined properly. Although
it is clear what $A'$ is, the matrices $W'$ and $\sigma' v'{v'}^{T}$ may not have a clear definition.
For definiteness, we consider the following assumption: if the graph $G$ is perturbed by adding a
weight $\varepsilon > 0$ to the edge $(i,j)$ then $W' = W$ and there exists a symmetric matrix $E$
such that $\sigma' v'{v'}^{T} = \sigma(I + E)vv^T(I + E)$ and $\|E\|_2 < \eta\varepsilon$ for some
$\eta$. That is, we assume that the rank-one term in $M'$ is a small relative perturbation of that
in $M$. That assumption is fulfilled in practice by all modularity-type matrices introduced in
Section 2. Note that it is possible to take as $E$ the diagonal matrix whose diagonal entries are

$$ E_{ii} = \frac{\sqrt{\sigma'}\, v'_i - \sqrt{\sigma}\, v_i}{\sqrt{\sigma}\, v_i}, \qquad i = 1,\dots,n. $$
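
As an illustration, the diagonal choice of $E$ can be tested numerically. The sketch below assumes the
Newman-Girvan case, where $\sigma = 1/\mathrm{vol}(V)$ and $v = d$ is the degree vector; the graph
and the function names are hypothetical. It checks the identity
$\sigma' v'{v'}^{T} = \sigma(I + E)vv^T(I + E)$ and reports the ratio $\|E\|_2/\varepsilon$, which
plays the role of $\eta$.

```python
import numpy as np

# Illustrative graph: two 4-cycles joined by the edge (3, 4).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 5), (5, 6), (6, 7), (7, 4), (3, 4)]
n = 8
A = np.zeros((n, n))
for a, b in edges:
    A[a, b] = A[b, a] = 1.0

def rank_one_term(A):
    """Return (sigma, v) with sigma = 1/vol(V) and v = d (assumed Newman-Girvan form)."""
    d = A.sum(axis=1)
    return 1.0 / d.sum(), d

def perturbation_matrix(A, i, j, eps):
    """Diagonal E built entrywise from (sigma, v) and (sigma', v')."""
    sigma, v = rank_one_term(A)
    A_new = A.copy()
    A_new[i, j] += eps
    A_new[j, i] += eps
    sigma_new, v_new = rank_one_term(A_new)
    e_diag = (np.sqrt(sigma_new) * v_new - np.sqrt(sigma) * v) / (np.sqrt(sigma) * v)
    return np.diag(e_diag), (sigma, v), (sigma_new, v_new)

eps = 1e-3
E, (sigma, v), (sigma_new, v_new) = perturbation_matrix(A, 0, 2, eps)
I = np.eye(n)
lhs = sigma_new * np.outer(v_new, v_new)
rhs = sigma * (I + E) @ np.outer(v, v) @ (I + E)
identity_holds = np.allclose(lhs, rhs)
norm_E = np.linalg.norm(E, 2)   # spectral norm; equals max |E_ii| for a diagonal matrix
print(identity_holds, norm_E / eps)
```

The construction requires $v_i \neq 0$ for all $i$, which holds here because every vertex has
positive degree.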


A possible result along this direction is discussed throughout the remaining part of this section. 


Definition 5.1. Let $G_0$ be a given graph, let $i,j \in V$ be a fixed pair of vertices, and let
$G_\varepsilon$ be the graph obtained by adding the edge $(i,j)$ to $G_0$ with weight
$\varepsilon > 0$. (Assume that, if $(i,j)$ is already an edge in $G_0$, then its weight in
$G_\varepsilon$ is increased by $\varepsilon$.) Let $M_0$ and $M_\varepsilon$ be generalized
modularity matrices of $G_0$ and $G_\varepsilon$, respectively. Define

$$ \mu^{\varepsilon}_{ij} = \frac{m_{G_\varepsilon} - m_{G_0}}{\varepsilon}. $$

Let $M_0$ and $M_\varepsilon$ be as in the previous definition. If
$M_\varepsilon - M_0 = \varepsilon(e_ie_j^T + e_je_i^T)$ and $\lambda_1(M_0)$ is simple then
$\lambda_1(M_\varepsilon)$ varies according to the sign of $x_ix_j$, where $x$ is a leading
eigenvector of $M_0$, at least for sufficiently small $\varepsilon$. In fact, from classical results
in eigenvalue perturbation theory [27], in the stated hypotheses $\lambda_1(M_\varepsilon)$ is
differentiable for small $\varepsilon$, whence
$\mu^{\varepsilon}_{ij} = \lambda_1'(M_0) + O(\varepsilon)$. Moreover, assuming that $x$ is
normalized, we have

$$ \lambda_1'(M_0) = \varepsilon^{-1} x^T(M_\varepsilon - M_0)x = x^T(e_ie_j^T + e_je_i^T)x = 2x_ix_j, $$
showing indeed that $\mu^{\varepsilon}_{ij} = 2x_ix_j + O(\varepsilon)$. Now consider the general
case where, according to our previous assumption, we have

$$ M_\varepsilon - M_0 = \varepsilon(e_ie_j^T + e_je_i^T) - \sigma\big[(I + E)vv^T(I + E) - vv^T\big]. $$

Then,

$$ \begin{aligned}
x^T(M_\varepsilon - M_0)x &= 2\varepsilon x_ix_j - \sigma\big[x^T(I + E)vv^T(I + E)x - x^Tvv^Tx\big] \\
&= 2\varepsilon x_ix_j - \sigma\big[(x^T(I + E)v)^2 - (v^Tx)^2\big] \\
&= 2\varepsilon x_ix_j - \sigma\,(x^T(2I + E)v)(v^TEx).
\end{aligned} $$

Let $\cos\theta = (x^Tv)/(\|v\|_2\|x\|_2)$ be the cosine of the angle between $x$ and $v$. Taking
norms and assuming $\|x\|_2 = 1$ as before, we obtain

$$ |\mu^{\varepsilon}_{ij} - 2x_ix_j| \le 2\eta\,|\cos\theta|\,\sigma\|v\|_2^2 = 2\eta\,|\cos\theta|\,\|\sigma vv^T\|_2, $$

neglecting lower order terms. Thus, if $\eta$ and $|\cos\theta|$ are sufficiently small then
$\mu^{\varepsilon}_{ij}$ has the same sign as $x_ix_j$. In particular, if the new edge is added
between two nodes having the same sign in $x$ then the algebraic modularity increases, and
conversely it decreases if $x_ix_j < 0$.
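
The sign rule can be observed numerically. The sketch below is an illustration under stated
assumptions: it uses the Newman-Girvan matrix $A - dd^T/\mathrm{vol}(V)$ as modularity matrix, a
small test graph chosen for convenience, and hypothetical helper names. It compares the difference
quotient of the rightmost eigenvalue with $2x_ix_j$ for one edge joining vertices with equal sign in
$x$ and one edge joining vertices with opposite sign, and it also reports $\cos\theta$ for $v = d$.

```python
import numpy as np

def modularity_matrix(A):
    """Newman-Girvan matrix A - d d^T / vol(V) (assumed form of M)."""
    d = A.sum(axis=1)
    return A - np.outer(d, d) / d.sum()

def leading_eigenpair(M):
    """Largest eigenvalue of a symmetric matrix and a unit eigenvector."""
    w, V = np.linalg.eigh(M)          # eigenvalues in ascending order
    return w[-1], V[:, -1]

def difference_quotient(A, i, j, eps):
    """(m_{G_eps} - m_{G_0}) / eps when weight eps is added to the edge (i, j)."""
    A_eps = A.copy()
    A_eps[i, j] += eps
    A_eps[j, i] += eps
    m0, _ = leading_eigenpair(modularity_matrix(A))
    m_eps, _ = leading_eigenpair(modularity_matrix(A_eps))
    return (m_eps - m0) / eps

# Illustrative graph: two 4-cycles joined by the edge (3, 4).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 5), (5, 6), (6, 7), (7, 4), (3, 4)]
A = np.zeros((8, 8))
for a, b in edges:
    A[a, b] = A[b, a] = 1.0

_, x = leading_eigenpair(modularity_matrix(A))
d = A.sum(axis=1)
cos_theta = float(x @ d / np.linalg.norm(d))   # x has unit norm

eps = 1e-6
results = {}
for i, j in [(0, 2), (0, 5)]:
    mu = difference_quotient(A, i, j, eps)
    results[(i, j)] = (mu, 2 * x[i] * x[j])
    print((i, j), mu, 2 * x[i] * x[j])
print("cos(theta) =", cos_theta)
```

The product $x_ix_j$ does not depend on the sign chosen for the eigenvector, so the comparison is
well defined. On other graphs the agreement may be less tight, in accordance with the role of $\eta$
and $|\cos\theta|$ in the bound above.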


6. Positive eigenvalues and number of modules 

On the basis of rather informal arguments, Newman claims in [17, Sect. B] that the number
of positive eigenvalues of $M_{NG}$ is related to the number of communities recognizable in the
graph $G$. The subsequent Theorem 6.2, which generalizes an analogous result concerning the
matrix $M_{NG}$ shown in [9, Thm. 6.2], proves that for any generalized modularity matrix $M$ and
the modularity function $Q$ associated to it, the number of positive eigenvalues of $M$ is actually
an upper bound for the cardinality of any family of pairwise disjoint modules in $G$ having the
property that, if any two modules are merged then the overall modularity does not increase.

Lemma 6.1. Let $S_1,\dots,S_k$ be $k$ pairwise disjoint, nontrivial subsets of $V$, with $k > 1$.
Let $C$ be the $k \times k$ symmetric matrix with $C_{ij} = \mathbb{1}_{S_i}^T M \mathbb{1}_{S_j}$,
where $M$ is any modularity matrix. The number of positive (nonnegative) eigenvalues of $M$ is not
smaller than the number of positive (nonnegative, respectively) eigenvalues of $C$.

Proof. Consider the matrices $Z = [\mathbb{1}_{S_1} \cdots \mathbb{1}_{S_k}]$ and
$\Sigma = \mathrm{Diag}(|S_1|,\dots,|S_k|)^{-1/2}$. Note that $\widetilde{Z} = Z\Sigma$ has
orthonormal columns. By Sylvester’s law of inertia, the number of positive (nonnegative)
eigenvalues of $C$ coincides with the number of positive (nonnegative, respectively) eigenvalues of
$\Sigma C \Sigma = \widetilde{Z}^T M \widetilde{Z}$. The claim follows by Cauchy interlacing
inequalities (1). □
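
The construction used in the proof can be reproduced numerically. The following sketch assumes
$M = A - dd^T/\mathrm{vol}(V)$ and an arbitrary pair of disjoint subsets of a small test graph; all
names are illustrative. It builds $Z$, $\Sigma$ and $C = Z^T M Z$, checks that $Z\Sigma$ has
orthonormal columns, and compares the eigenvalue counts of $M$ and $C$, using a small tolerance to
decide the sign of computed eigenvalues.

```python
import numpy as np

def modularity_matrix(A):
    """Newman-Girvan matrix A - d d^T / vol(V) (assumed form of M)."""
    d = A.sum(axis=1)
    return A - np.outer(d, d) / d.sum()

def count_eigs(M, tol=1e-10):
    """Numbers of positive and of nonnegative eigenvalues of a symmetric matrix."""
    w = np.linalg.eigvalsh(M)
    return int((w > tol).sum()), int((w > -tol).sum())

# Illustrative graph: two 4-cycles joined by the edge (3, 4).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 5), (5, 6), (6, 7), (7, 4), (3, 4)]
n = 8
A = np.zeros((n, n))
for a, b in edges:
    A[a, b] = A[b, a] = 1.0
M = modularity_matrix(A)

# Pairwise disjoint subsets S_1, S_2 (they need not cover V).
subsets = [[0, 1, 2, 3], [4, 5]]
k = len(subsets)
Z = np.zeros((n, k))
for col, S in enumerate(subsets):
    Z[S, col] = 1.0                       # column = indicator vector of S
Sigma = np.diag([len(S) ** -0.5 for S in subsets])
Z_tilde = Z @ Sigma

C = Z.T @ M @ Z                           # C_ij = 1_{S_i}^T M 1_{S_j}
orthonormal = np.allclose(Z_tilde.T @ Z_tilde, np.eye(k))
pos_M, nonneg_M = count_eigs(M)
pos_C, nonneg_C = count_eigs(C)
print(orthonormal, (pos_M, nonneg_M), (pos_C, nonneg_C))
```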

Given a family of pairwise disjoint subsets $\mathcal{P} = \{S_1,\dots,S_k\}$ one usually defines
the modularity of $\mathcal{P}$ as $Q(\mathcal{P}) = \sum_{i=1}^{k} Q(S_i)$. The maximization of the
latter quantity is a recurrent task in community detection algorithms [17, 24, 26]. If each $S_i$
is a module, $\mathcal{P}$ maximizes $Q(\mathcal{P})$ and contains the least number of sets among
all such families, then $Q(S_i,S_j) < 0$ for $i \neq j$; otherwise we could reduce $|\mathcal{P}|$
or increase $Q(\mathcal{P})$ (or both) by merging subsets whose joint modularity is nonnegative. In
that case, the matrix $C$ introduced in the preceding lemma has a sign pattern which is well known
in the field of nonnegative matrices [2]. One possible consequence is stated in the forthcoming
result, relating the number of positive eigenvalues of $M$ to the number of disjoint modules in $G$
that optimize the overall modularity.

Theorem 6.2. Let $S_1,\dots,S_k$ be $k$ pairwise disjoint, nontrivial subsets of $V$, with $k > 1$.
Suppose that, for all $i = 1,\dots,k$, we have $Q(S_i) > 0$ and $Q(S_i,S_j) < 0$ for $i \neq j$. If
there exist positive numbers $\alpha_1,\dots,\alpha_k$ such that

$$ \alpha_i Q(S_i) + \sum_{j \neq i} \alpha_j Q(S_i,S_j) > 0, \qquad i = 1,\dots,k, $$

then $M$ has at least $k$ positive eigenvalues.

Proof. Let $C$ be the $k \times k$ symmetric matrix with
$C_{ij} = \mathbb{1}_{S_i}^T M \mathbb{1}_{S_j}$ and let $a = (\alpha_1,\dots,\alpha_k)^T$. In the
stated hypotheses $C_{ii} > 0$, $C_{ij} < 0$ for $i \neq j$ and $Ca > 0$. By a classical result on
nonnegative matrices [2, §6.2] $C$ is a symmetric M-matrix, so in particular it is positive
definite. The claim follows immediately from Lemma 6.1. □

It is worth noting that the condition on $\alpha_1,\dots,\alpha_k$ in the previous theorem can be
easily fulfilled when $S_1,\dots,S_k$ is a partition of $V$ and $M = M_{NG}$ or $M = M_{AFG}$, see
Section 2. Indeed, in those cases we have $M\mathbb{1} = 0$ and, consequently, we can obtain the
sought inequalities by setting $\alpha_1 = \dots = \alpha_k = 1$ as shown in the forthcoming corollary.

Corollary 6.3. Let $\mathcal{P} = \{S_1,\dots,S_p\}$ be a partition of $V$ into pairwise disjoint
subsets. Suppose that $Q(S_i) > 0$ and $Q(S_i,S_j) < 0$ for $i \neq j$, where $Q$ is the modularity
function associated to a generalized modularity matrix $M$ such that $M\mathbb{1} = 0$. Then the
number of positive eigenvalues of $M$ is at least $p - 1$.

Proof. Let $Z = [\mathbb{1}_{S_1} \cdots \mathbb{1}_{S_p}]$ and
$C = Z^T M Z = (\mathbb{1}_{S_i}^T M \mathbb{1}_{S_j})$. Since $Z\mathbb{1} = \mathbb{1}$, by
hypothesis we obtain $C\mathbb{1} = Z^T M Z\mathbb{1} = 0$. Then, for $i = 1,\dots,p-1$ we have

$$ Q(S_i) + \sum_{j \neq i,p} Q(S_i,S_j) = -Q(S_i,S_p) > 0. $$

Using Theorem 6.2 with $k = p - 1$ we obtain the claim. □
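
A small numerical illustration of the corollary follows. It assumes
$M = A - dd^T/\mathrm{vol}(V)$, for which $M\mathbb{1} = 0$, and a test graph made of three
triangles connected in a ring by single edges, with the triangles taken as the partition. The graph
and the variable names are illustrative choices. The code checks the sign hypotheses on
$C = Z^T M Z$ and counts the positive eigenvalues of $M$.

```python
import numpy as np

def modularity_matrix(A):
    """Newman-Girvan matrix A - d d^T / vol(V) (assumed form of M)."""
    d = A.sum(axis=1)
    return A - np.outer(d, d) / d.sum()

# Illustrative graph: triangles {0,1,2}, {3,4,5}, {6,7,8} joined in a ring.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (6, 7), (7, 8), (6, 8),
         (2, 3), (5, 6), (8, 0)]
n = 9
A = np.zeros((n, n))
for a, b in edges:
    A[a, b] = A[b, a] = 1.0
M = modularity_matrix(A)

partition = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
p = len(partition)
Z = np.zeros((n, p))
for col, S in enumerate(partition):
    Z[S, col] = 1.0

C = Z.T @ M @ Z                                  # C_ii = Q(S_i), C_ij = Q(S_i, S_j)
off_diagonal = C[~np.eye(p, dtype=bool)]
kernel_ok = np.allclose(M @ np.ones(n), 0.0)     # hypothesis M 1 = 0
hypotheses_ok = bool(np.all(np.diag(C) > 0) and np.all(off_diagonal < 0))

eigenvalues = np.linalg.eigvalsh(M)
n_positive = int((eigenvalues > 1e-10).sum())    # tolerance guards against rounding
print(kernel_ok, hypotheses_ok, n_positive, p - 1)
```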

We close this section with the forthcoming theorem which states that, if $G$ has $k$ subgraphs
that are well separated and sufficiently rich in internal edges (including loops), then $M$ has at
least $k - 1$ positive eigenvalues. This result extends Theorem 6.1 in [9] to arbitrary generalized
modularity matrices. For better clarity consider that, if $S$ and $T$ are two disjoint subsets of
$V$, then the number $\mathbb{1}_S^T A \mathbb{1}_T$ corresponds to the total weight of edges
joining nodes in $S$ with nodes in $T$.

Theorem 6.4. Let $S_1,\dots,S_k$ be pairwise disjoint subsets of $V$, with $k > 1$, such that

$$ e_{\mathrm{in}}(S_i) + \mathbb{1}_{S_i}^T W \mathbb{1}_{S_i} > \sum_{j \neq i} \mathbb{1}_{S_i}^T A \mathbb{1}_{S_j}, \qquad i = 1,\dots,k. $$

Then $M$ has at least $k - 1$ positive eigenvalues.

Proof. Consider the matrices $Z$ and $\Sigma$ introduced in the proof of Lemma 6.1. Introduce the
$k \times k$ matrix $B = Z^T(A + W)Z$. We have
$B_{ii} = e_{\mathrm{in}}(S_i) + \mathbb{1}_{S_i}^T W \mathbb{1}_{S_i}$ and
$B_{ij} = \mathbb{1}_{S_i}^T A \mathbb{1}_{S_j}$ for $i \neq j$. By hypothesis, $B$ is nonnegative
and strictly diagonally dominant, hence it is positive definite. Consider the matrix $C$ defined in
Lemma 6.1:

$$ C = Z^T M Z = Z^T(A + W - \sigma vv^T)Z = B - \sigma (Z^Tv)(Z^Tv)^T. $$

We see that $C$ is a negative semidefinite, rank-one perturbation of $B$, hence it has at least
$k - 1$ positive eigenvalues. The claim follows from Lemma 6.1. □
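
The steps of the proof can be traced on a concrete example. The sketch below is only an
illustration: the parameters $W = 0$, $v = \mathbb{1}$ and $\sigma = 0.3$ are arbitrary choices made
here, not one of the matrices of Section 2, and the graph consists of three triangles joined in a
ring. The code forms $B$ and $C$, checks that $B$ is positive definite, and counts the positive
eigenvalues of $C$ and $M$.

```python
import numpy as np

# Illustrative graph: triangles {0,1,2}, {3,4,5}, {6,7,8} joined in a ring.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (6, 7), (7, 8), (6, 8),
         (2, 3), (5, 6), (8, 0)]
n = 9
A = np.zeros((n, n))
for a, b in edges:
    A[a, b] = A[b, a] = 1.0

# Hypothetical parameters of M = A + W - sigma v v^T, chosen only for illustration.
W = np.zeros((n, n))
v = np.ones(n)
sigma = 0.3
M = A + W - sigma * np.outer(v, v)

subsets = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
k = len(subsets)
Z = np.zeros((n, k))
for col, S in enumerate(subsets):
    Z[S, col] = 1.0

B = Z.T @ (A + W) @ Z                   # B_ii = e_in(S_i) + 1^T W 1, B_ij = 1^T A 1
row_off_sums = B.sum(axis=1) - np.diag(B)
dominant = bool(np.all(np.diag(B) > row_off_sums))   # hypothesis of the theorem

Zv = Z.T @ v
C = B - sigma * np.outer(Zv, Zv)        # rank-one negative semidefinite update of B
same_C = np.allclose(C, Z.T @ M @ Z)

tol = 1e-10
min_eig_B = float(np.linalg.eigvalsh(B).min())
pos_C = int((np.linalg.eigvalsh(C) > tol).sum())
pos_M = int((np.linalg.eigvalsh(M) > tol).sum())
print(dominant, same_C, min_eig_B, pos_C, pos_M, k - 1)
```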

7. Conclusions


Community detection is a major problem arising in modern complex network analysis, and 
modularity-type matrices and functions play a fundamental role in network science. In fact, 
several generalizations of the modularity matrix originally introduced by Newman and Girvan 
[17, 18, 19] appear in the complex networks literature, often in a rather hidden form. In 
this paper we have highlighted that a common structure and various spectral properties are
shared by all these matrices. As the matrix theoretic approach to modularity based methods 
is very recent, several directions of investigation are left open. Relevant steps would be, in 
our opinion, to provide lower bounds for the modularity of graphs in terms of the spectrum 
of the associated modularity matrix, and robustness results of leading modules with respect to 
different modularity measures. 